AITopics | attack surface

Collaborating Authors

attack surface

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

A Measurement Study of Model Context Protocol Ecosystem

Guo, Hechuan, Hao, Yongle, Zhang, Yue, Xu, Minghui, Lv, Peizhuo, Chen, Jiezhi, Cheng, Xiuzhen

arXiv.org Artificial IntelligenceNov-18-2025

The Model Context Protocol (MCP) has been proposed as a unifying standard for connecting large language models (LLMs) with external tools and resources, promising the same role for AI integration that HTTP and USB played for the Web and peripherals. Yet, despite rapid adoption and hype, its trajectory remains uncertain. Are MCP marketplaces truly growing, or merely inflated by placeholders and abandoned prototypes? Are servers secure and privacy-preserving, or do they expose users to systemic risks? And do clients converge on standardized protocols, or remain fragmented across competing designs? In this paper, we present the first large-scale empirical study of the MCP ecosystem. We design and implement MCPCrawler, a systematic measurement framework that collects and normalizes data from six major markets. Over a 14-day campaign, MCPCrawler aggregated 17,630 raw entries, of which 8,401 valid projects (8,060 servers and 341 clients) were analyzed. Our results reveal that more than half of listed projects are invalid or low-value, that servers face structural risks including dependency monocultures and uneven maintenance, and that clients exhibit a transitional phase in protocol and connection patterns. Together, these findings provide the first evidence-based view of the MCP ecosystem, its risks, and its future trajectory.

artificial intelligence, large language model, natural language, (16 more...)

arXiv.org Artificial Intelligence

2509.25292

Country:

Asia > China (0.05)
Asia > Singapore (0.04)
North America > United States > California > Orange County > Irvine (0.04)
North America > Trinidad and Tobago > Trinidad > North Atlantic Ocean (0.04)

Genre: Research Report > New Finding (1.00)

Industry:

Information Technology > Security & Privacy (1.00)
Commercial Services & Supplies > Security & Alarm Services (0.67)

Technology:

Information Technology > Communications > Networks (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)

Add feedback

The Dark Side of LLMs: Agent-based Attacks for Complete Computer Takeover

Lupinacci, Matteo, Pironti, Francesco Aurelio, Blefari, Francesco, Romeo, Francesco, Arena, Luigi, Furfaro, Angelo

arXiv.org Artificial IntelligenceNov-5-2025

The rapid adoption of Large Language Model (LLM) agents and multi-agent systems enables remarkable capabilities in natural language processing and generation. However, these systems introduce security vulnerabilities that extend beyond traditional content generation to system-level compromises. This paper presents a comprehensive evaluation of the LLMs security used as reasoning engines within autonomous agents, highlighting how they can be exploited as attack vectors capable of achieving computer takeovers. We focus on how different attack surfaces and trust boundaries can be leveraged to orchestrate such takeovers. We demonstrate that adversaries can effectively coerce popular LLMs into autonomously installing and executing malware on victim machines. Our evaluation of 18 state-of-the-art LLMs reveals an alarming scenario: 94.4% of models succumb to Direct Prompt Injection, and 83.3% are vulnerable to the more stealthy and evasive RAG Backdoor Attack. Notably, we tested trust boundaries within multi-agent systems, where LLM agents interact and influence each other, and we revealed that LLMs which successfully resist direct injection or RAG backdoor attacks will execute identical payloads when requested by peer agents. We found that 100.0% of tested LLMs can be compromised through Inter-Agent Trust Exploitation attacks, and that every model exhibits context-dependent security behaviors that create exploitable blind spots.

large language model, machine learning, natural language, (21 more...)

arXiv.org Artificial Intelligence

2507.0685

Country:

North America > United States (0.46)
North America > Mexico (0.28)

Genre: Research Report > New Finding (0.93)

Industry: Information Technology > Security & Privacy (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

MCPSecBench: A Systematic Security Benchmark and Playground for Testing Model Context Protocols

Yang, Yixuan, Wu, Daoyuan, Chen, Yufan

arXiv.org Artificial IntelligenceOct-10-2025

Large Language Models (LLMs) are increasingly integrated into real-world applications via the Model Context Protocol (MCP), a universal, open standard for connecting AI agents with data sources and external tools. While MCP enhances the capabilities of LLM-based agents, it also introduces new security risks and expands their attack surfaces. In this paper, we present the first systematic taxonomy of MCP security, identifying 17 attack types across 4 primary attack surfaces. Our benchmark is modular and extensible, allowing researchers to incorporate custom implementations of clients, servers, and transport protocols for systematic security assessment. Experimental results show that over 85% of the identified attacks successfully compromise at least one platform, with core vulnerabilities universally affecting Claude, OpenAI, and Cursor, while prompt-based and tool-centric attacks exhibit considerable variability across different hosts and models. In addition, current protection mechanisms have little effect against these attacks. Large language models (LLMs) are transforming the landscape of intelligent systems, enabling powerful language understanding, reasoning, and generative capabilities. To further unlock their potential in real-world applications, there is an increasing demand for LLMs to interact with external data, tools, and services (Lin et al., 2025; Hasan et al., 2025). The Model Context Protocol (MCP) has emerged as a universal, open standard for connecting AI agents to diverse resources, facilitating richer and more dynamic task-solving. However, this integration also introduces a broader attack surface: vulnerabilities may arise not only from user prompts (such as prompt injection (Shi et al., 2024)), but also from insecure clients, transport protocols, and malicious or misconfigured servers (Hasan et al., 2025). As MCP-powered agents increasingly interact with sensitive enterprise systems and even physical infrastructure, securing the entire MCP stack becomes critical to prevent data breaches, unauthorized actions, and real-world harm (Narajala & Habler, 2025).

large language model, machine learning, natural language, (17 more...)

arXiv.org Artificial Intelligence

2508.1322

Genre: Research Report > New Finding (0.48)

Industry: Information Technology > Security & Privacy (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.52)

Add feedback

Hybrid Deep Learning-Federated Learning Powered Intrusion Detection System for IoT/5G Advanced Edge Computing Network

Baidar, Rasil, Maric, Sasa, Abbas, Robert

arXiv.org Artificial IntelligenceSep-22-2025

The exponential expansion of IoT and 5G-Advanced applications has enlarged the attack surface for DDoS, malware, and zero-day intrusions. We propose an intrusion detection system that fuses a convolutional neural network (CNN), a bidirectional LSTM (BiLSTM), and an autoencoder (AE) bottleneck within a privacy-preserving federated learning (FL) framework. The CNN-BiLSTM branch captures local and gated cross-feature interactions, while the AE emphasizes reconstruction-based anomaly sensitivity. Training occurs across edge devices without sharing raw data. On UNSW-NB15 (binary), the fused model attains AUC 99.59 percent and F1 97.36 percent; confusion-matrix analysis shows balanced error rates with high precision and recall. Average inference time is approximately 0.0476 ms per sample on our test hardware, which is well within the less than 10 ms URLLC budget, supporting edge deployment. We also discuss explainability, drift tolerance, and FL considerations for compliant, scalable 5G-Advanced IoT security.

artificial intelligence, detection, machine learning, (14 more...)

arXiv.org Artificial Intelligence

2509.15555

Genre:

Research Report > New Finding (0.68)
Research Report > Promising Solution (0.46)

Industry:

Law Enforcement & Public Safety > Crime Prevention & Enforcement (1.00)
Information Technology > Security & Privacy (1.00)
Government > Military > Cyberwarfare (0.46)

Technology:

Information Technology > Security & Privacy (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

Surveying the Operational Cybersecurity and Supply Chain Threat Landscape when Developing and Deploying AI Systems

Smith, Michael R, Ingram, Joe

arXiv.org Artificial IntelligenceAug-29-2025

The rise of AI has transformed the software and hardware landscape, enabling powerful capabilities through specialized infrastructures, large-scale data storage, and advanced hardware. However, these innovations introduce unique attack surfaces and objectives which traditional cybersecurity assessments often overlook. Cyber attackers are shifting their objectives from conventional goals like privilege escalation and network pivoting to manipulating AI outputs to achieve desired system effects, such as slowing system performance, flooding outputs with false positives, or degrading model accuracy. This paper serves to raise awareness of the novel cyber threats that are introduced when incorporating AI into a software system. W e explore the operational cybersecurity and supply chain risks across the AI lifecycle, emphasizing the need for tailored security frameworks to address evolving threats in the AI-driven landscape. W e highlight previous exploitations and provide insights from working in this area. By understanding these risks, organizations can better protect AI systems and ensure their reliability and resilience.

large language model, machine learning, natural language, (20 more...)

arXiv.org Artificial Intelligence

2508.20307

Country: North America > United States (1.00)

Genre:

Overview (0.93)
Research Report (0.64)

Industry:

Information Technology > Security & Privacy (1.00)
Government > Regional Government > North America Government > United States Government (0.94)
Government > Military > Cyberwarfare (0.92)

Technology:

Information Technology > Security & Privacy (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
(3 more...)

Add feedback

Prefill-level Jailbreak: A Black-Box Risk Analysis of Large Language Models

Li, Yakai, Hu, Jiekang, Sang, Weiduan, Ma, Luping, Nie, Dongsheng, Zhang, Weijuan, Yu, Aimin, Su, Yi, Huang, Qingjia, Zhou, Qihang

arXiv.org Artificial IntelligenceAug-27-2025

Large Language Models face security threats from jailbreak attacks. Existing research has predominantly focused on prompt-level attacks while largely ignoring the underexplored attack surface of user-controlled response prefilling. This functionality allows an attacker to dictate the beginning of a model's output, thereby shifting the attack paradigm from persuasion to direct state manipulation.In this paper, we present a systematic black-box security analysis of prefill-level jailbreak attacks. We categorize these new attacks and evaluate their effectiveness across fourteen language models. Our experiments show that prefill-level attacks achieve high success rates, with adaptive methods exceeding 99% on several models. Token-level probability analysis reveals that these attacks work through initial-state manipulation by changing the first-token probability from refusal to compliance.Furthermore, we show that prefill-level jailbreak can act as effective enhancers, increasing the success of existing prompt-level attacks by 10 to 15 percentage points. Our evaluation of several defense strategies indicates that conventional content filters offer limited protection. We find that a detection method focusing on the manipulative relationship between the prompt and the prefill is more effective. Our findings reveal a gap in current LLM safety alignment and highlight the need to address the prefill attack surface in future safety training.

arxiv preprint arxiv, large language model, machine learning, (16 more...)

arXiv.org Artificial Intelligence

2504.21038

Genre: Research Report > New Finding (0.66)

Industry: Information Technology > Security & Privacy (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

Heterogeneity-Oblivious Robust Federated Learning

Zhang, Weiyao, Li, Jinyang, Song, Qi, Wang, Miao, Lin, Chungang, Luo, Haitong, Meng, Xuying, Zhang, Yujun

arXiv.org Artificial IntelligenceAug-7-2025

--Federated Learning (FL) remains highly vulnerable to poisoning attacks, especially under real-world hyper-heterogeneity, where clients differ significantly in data distributions, communication capabilities, and model architectures. Such heterogeneity not only undermines the effectiveness of aggregation strategies but also makes attacks more difficult to detect. Furthermore, high-dimensional models expand the attack surface. T o address these challenges, we propose Horus, a heterogeneity-oblivious robust FL framework centered on low-rank adaptations (LoRAs). Rather than aggregating full model parameters, Horus inserts LoRAs into empirically stable layers and aggregates only LoRAs to reduce the attack surface. We uncover a key empirical observation that the input projection (LoRA-A) is markedly more stable than the output projection (LoRA-B) under heterogeneity and poisoning. Leveraging this, we design a Heterogeneity-Oblivious Poisoning Score using the features from LoRA-A to filter poisoned clients. For the remaining benign clients, we propose projection-aware aggregation mechanism to preserve collaborative signals while suppressing drifts, which reweights client updates by consistency with the global directions. Extensive experiments across diverse datasets, model architectures, and attacks demonstrate that Horus consistently outperforms state-of-the-art baselines in both robustness and accuracy. Federated Learning (FL) has gained significant traction as a privacy-preserving paradigm for distributed training, enabling clients to collaboratively learn a global model without sharing their raw data [12], [20]. However, the decentralized nature of FL inherently introduces serious security vulnerabilities, making it susceptible to poisoning attacks, in which attackers inject malicious data or local updates. Such attacks pose a particularly insidious threat, as they can stealthily degrade or manipulate the global model over time [29]. For example, perturbing a federated model deployed in vehicular systems could autonomously start the vehicle or execute an emergency brake, thereby endangering human lives and compromising property safety [24].

accuracy, artificial intelligence, machine learning, (17 more...)

arXiv.org Artificial Intelligence

2508.03579

Country:

Europe (0.67)
North America > Canada (0.46)
North America > United States (0.28)

Genre: Research Report (0.50)

Industry: Information Technology > Security & Privacy (0.87)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.46)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.46)

Add feedback

Challenges in GenAI and Authentication: a scoping review

Bezerra, Wesley dos Reis, Bezerra, Lais Machado, Westphall, Carlos Becker

arXiv.org Artificial IntelligenceJul-17-2025

Authentication and authenticity have been a security challenge since the beginning of information sharing, especially in the context of digital information. With the advancement of generative artificial intelligence, these challenges have evolved, demanding a more up-to-date analysis of their impacts on society and system security. This work presents a scoping review that analyzed 88 documents from the IEEExplorer, Scopus, and ACM databases, promoting an analysis of the resulting portfolio through six guiding questions focusing on the most relevant work, challenges, attack surfaces, threats, proposed solutions, and gaps. Finally, the portfolio articles are analyzed through this guiding research lens and also receive individualized analysis. The results consistently outline the challenges, gaps, and threats related to images, text, audio, and video, thereby supporting new research in the areas of authentication and generative artificial intelligence.

artificial intelligence, machine learning, natural language, (17 more...)

arXiv.org Artificial Intelligence

2507.11775

Country: South America > Brazil (0.28)

Genre:

Research Report (1.00)
Overview (1.00)

Industry: Information Technology > Security & Privacy (1.00)

Technology:

Information Technology > Security & Privacy (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Issues > Social & Ethical Issues (1.00)
(2 more...)

Add feedback

Entangled Threats: A Unified Kill Chain Model for Quantum Machine Learning Security

Debus, Pascal, Wendlinger, Maximilian, Tscharke, Kilian, Herr, Daniel, Brügmann, Cedric, de Mello, Daniel Ohl, Ulmanis, Juris, Erhard, Alexander, Schmidt, Arthur, Petsch, Fabian

arXiv.org Artificial IntelligenceJul-14-2025

Quantum Machine Learning (QML) systems inherit vulnerabilities from classical machine learning while introducing new attack surfaces rooted in the physical and algorithmic layers of quantum computing. Despite a growing body of research on individual attack vectors - ranging from adversarial poisoning and evasion to circuit-level backdoors, side-channel leakage, and model extraction - these threats are often analyzed in isolation, with unrealistic assumptions about attacker capabilities and system environments. This fragmentation hampers the development of effective, holistic defense strategies. In this work, we argue that QML security requires more structured modeling of the attack surface, capturing not only individual techniques but also their relationships, prerequisites, and potential impact across the QML pipeline. We propose adapting kill chain models, widely used in classical IT and cybersecurity, to the quantum machine learning context. Such models allow for structured reasoning about attacker objectives, capabilities, and possible multi-stage attack paths - spanning reconnaissance, initial access, manipulation, persistence, and exfiltration. Based on extensive literature analysis, we present a detailed taxonomy of QML attack vectors mapped to corresponding stages in a quantum-aware kill chain framework that is inspired by the MITRE ATLAS for classical machine learning. We highlight interdependencies between physical-level threats (like side-channel leakage and crosstalk faults), data and algorithm manipulation (such as poisoning or circuit backdoors), and privacy attacks (including model extraction and training data inference). This work provides a foundation for more realistic threat modeling and proactive security-in-depth design in the emerging field of quantum machine learning.

artificial intelligence, attacker, machine learning, (17 more...)

arXiv.org Artificial Intelligence

2507.08623

Country: Europe > Germany (0.68)

Genre:

Overview (0.93)
Research Report (0.82)

Industry: Information Technology > Security & Privacy (1.00)

Technology:

Information Technology > Security & Privacy (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.46)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.46)

Add feedback

We Urgently Need Privilege Management in MCP: A Measurement of API Usage in MCP Ecosystems

Li, Zhihao, Li, Kun, Ma, Boyang, Xu, Minghui, Zhang, Yue, Cheng, Xiuzhen

arXiv.org Artificial IntelligenceJul-10-2025

The Model Context Protocol (MCP) has emerged as a widely adopted mechanism for connecting large language models to external tools and resources. While MCP promises seamless extensibility and rich integrations, it also introduces a substantially expanded attack surface: any plugin can inherit broad system privileges with minimal isolation or oversight. In this work, we conduct the first large-scale empirical analysis of MCP security risks. We develop an automated static analysis framework and systematically examine 2,562 real-world MCP applications spanning 23 functional categories. Our measurements reveal that network and system resource APIs dominate usage patterns, affecting 1,438 and 1,237 servers respectively, while file and memory resources are less frequent but still significant. We find that Developer Tools and API Development plugins are the most API-intensive, and that less popular plugins often contain disproportionately high-risk operations. Through concrete case studies, we demonstrate how insufficient privilege separation enables privilege escalation, misinformation propagation, and data tampering. Based on these findings, we propose a detailed taxonomy of MCP resource access, quantify security-relevant API usage, and identify open challenges for building safer MCP ecosystems, including dynamic permission models and automated trust assessment.

artificial intelligence, large language model, natural language, (16 more...)

arXiv.org Artificial Intelligence

2507.0625

Country:

Europe (0.93)
North America > United States > Illinois (0.14)

Genre: Research Report > New Finding (0.47)

Industry: Information Technology > Security & Privacy (1.00)

Technology:

Information Technology > Security & Privacy (1.00)
Information Technology > Communications (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.50)

Add feedback